Take-home Exercise 4

Decoding Chaos

Author

Teo Suan Ern

Published

February 27, 2024

Modified

February 29, 2024

1. Overview


1.1 Project Brief

1.2 Project Objectives

2. Data Preparation


2.1 Install and launch R packages

This project uses p_load() from the pacman package, which checks whether each R package is installed, installs any that are missing, and then loads them into the session.

The following code chunk installs and launches the R packages.

pacman::p_load(tidyverse, kableExtra, # tidyverse already includes ggplot2, tidyr and dplyr
               rmarkdown, knitr,
               DT, # interactive data table
               highcharter, # timeseries highchart
               viridis, ggthemes,
               viridisLite, RColorBrewer,
               calendR, # calendar
               lubridate, # convert date from char to date format
               wordcloud, tidytext, # word cloud
               ggforce, # sina plot
               countrycode, sf, spdep, tmap, leaflet, # geospatial
               tm, plotly)
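
For readers without pacman, the same check-install-load behaviour can be approximated in base R. This is only a sketch: missing_pkgs() is a hypothetical helper, and the package vector lists two base packages purely so that the example runs anywhere.

```r
# Hypothetical helper: return the packages that are not yet installed
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Placeholder package list (assumption): swap in the project's packages
pkgs <- c("stats", "utils")

# Install whatever is missing, then attach everything
to_install <- missing_pkgs(pkgs)
if (length(to_install) > 0) install.packages(to_install)
invisible(lapply(pkgs, library, character.only = TRUE))
```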

2.2 Import Data

data <- read.csv("data/1900-01-01-2024-02-26-Southeast_Asia-Myanmar.csv")

2.3 Overview of the data

Dataset Structure

Use str() to check the structure of the data.

str(data)
'data.frame':   55574 obs. of  35 variables:
 $ event_id_cnty     : chr  "MMR56099" "MMR56222" "MMR56370" "MMR56376" ...
 $ event_date        : chr  "31-Dec-23" "31-Dec-23" "31-Dec-23" "31-Dec-23" ...
 $ year              : int  2023 2023 2023 2023 2023 2023 2023 2023 2023 2023 ...
 $ time_precision    : int  1 1 1 1 1 1 1 1 1 1 ...
 $ disorder_type     : chr  "Political violence" "Political violence" "Political violence" "Demonstrations" ...
 $ event_type        : chr  "Explosions/Remote violence" "Explosions/Remote violence" "Battles" "Protests" ...
 $ sub_event_type    : chr  "Shelling/artillery/missile attack" "Shelling/artillery/missile attack" "Armed clash" "Peaceful protest" ...
 $ actor1            : chr  "Military Forces of Myanmar (2021-)" "Military Forces of Myanmar (2021-)" "Phoenix DF: Phoenix Defense Force (Nattalin)" "Protesters (Myanmar)" ...
 $ assoc_actor_1     : chr  "" "" "" "" ...
 $ inter1            : int  1 1 3 6 1 1 3 1 2 1 ...
 $ actor2            : chr  "" "Civilians (Myanmar)" "Military Forces of Myanmar (2021-)" "" ...
 $ assoc_actor_2     : chr  "" "" "" "" ...
 $ inter2            : int  0 7 1 0 7 0 1 0 1 7 ...
 $ interaction       : int  10 17 13 60 17 10 13 10 12 17 ...
 $ civilian_targeting: chr  "" "Civilian targeting" "" "" ...
 $ iso               : int  104 104 104 104 104 104 104 104 104 104 ...
 $ region            : chr  "Southeast Asia" "Southeast Asia" "Southeast Asia" "Southeast Asia" ...
 $ country           : chr  "Myanmar" "Myanmar" "Myanmar" "Myanmar" ...
 $ admin1            : chr  "Mon" "Rakhine" "Bago-West" "Sagaing" ...
 $ admin2            : chr  "Mawlamyine" "Maungdaw" "Thayarwady" "Yinmarbin" ...
 $ admin3            : chr  "Ye" "Maungdaw" "Nattalin" "Salingyi" ...
 $ location          : chr  "Aing Shey" "Kaing Gyi (NaTaLa)" "Kyauk Pyoke" "Let Pa Taung" ...
 $ latitude          : num  15.3 20.7 18.6 22.1 18.6 ...
 $ longitude         : num  98 92.4 95.8 95.1 95.8 ...
 $ geo_precision     : int  1 2 2 2 1 1 1 2 2 1 ...
 $ source            : chr  "Democratic Voice of Burma" "Development Media Group; Narinjara News" "Khit Thit Media; Myanmar Pressphoto Agency" "Myanmar Labour News" ...
 $ source_scale      : chr  "National" "Subnational" "National" "National" ...
 $ notes             : chr  "On 31 December 2023, in Aing Shey village (Ye township, Mawlamyine district, Mon state), following a clash betw"| __truncated__ "On 31 December 2023, in Kaing Gyi (Mro) village (coded as Kaing Gyi (NaTaLa)) (Maungdaw township, Maungdaw dist"| __truncated__ "On 31 December 2023, near Kyauk Pyoke village (Nattalin township, Thayarwady district, Bago-West region), the P"| __truncated__ "On 31 December 2023, in the Let Pa Taung area of Salingyi township (Yinmarbin district, Sagaing region), protes"| __truncated__ ...
 $ fatalities        : int  0 0 4 0 0 0 3 0 0 0 ...
 $ tags              : chr  "" "" "" "crowd size=no report" ...
 $ timestamp         : int  1704831212 1704831213 1704831214 1704831214 1704831214 1704831216 1704831216 1704831216 1704831216 1704831216 ...
 $ population_1km    : int  NA NA NA 749 NA 178 6634 671 687 35292 ...
 $ population_2km    : int  NA NA NA 521 NA 135 19078 2197 654 85732 ...
 $ population_5km    : int  3081 NA NA 1358 NA NA 34396 3144 656 169473 ...
 $ population_best   : int  3081 NA NA 749 NA NA 34396 3144 656 85732 ...

Use colSums() with is.na() to count the missing values in each column:

missing_values <- colSums(is.na(data)) 

missing_values %>% kable()
x
event_id_cnty 0
event_date 0
year 0
time_precision 0
disorder_type 0
event_type 0
sub_event_type 0
actor1 0
assoc_actor_1 0
inter1 0
actor2 0
assoc_actor_2 0
inter2 0
interaction 0
civilian_targeting 0
iso 0
region 0
country 0
admin1 0
admin2 0
admin3 0
location 0
latitude 0
longitude 0
geo_precision 0
source 0
source_scale 0
notes 0
fatalities 0
tags 0
timestamp 0
population_1km 9827
population_2km 9996
population_5km 10318
population_best 20848
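
Note that is.na() counts only true NA values. In the str() output above, several character columns (for example actor2, assoc_actor_1 and tags) hold empty strings, which are not NA and therefore report 0 here. A minimal sketch of counting both kinds of gap, using a small made-up data frame (toy and count_missing() are illustrative names, not part of the project):

```r
# Toy data frame standing in for the ACLED extract (values are made up)
toy <- data.frame(
  actor2         = c("", "Civilians (Myanmar)", ""),
  tags           = c("", "", "crowd size=no report"),
  population_1km = c(NA, 749L, NA),
  stringsAsFactors = FALSE
)

# Count NA values and empty strings in each column
count_missing <- function(df) {
  vapply(df, function(col) {
    sum(is.na(col) | (is.character(col) & col == ""), na.rm = TRUE)
  }, integer(1))
}

count_missing(toy)
```

Applied to the full dataset, a check like this would show how sparse fields such as actor2 really are before they are used in any analysis.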

Use duplicated() to check for duplicate rows:
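
The check itself is not shown in this draft. A minimal sketch with base R's duplicated(), run on a made-up data frame and assuming that a duplicate means a row identical in every column (the IDs and values below are invented for illustration):

```r
# Toy data frame with one repeated row (values are made up)
toy <- data.frame(
  event_id_cnty = c("MMR1", "MMR2", "MMR2", "MMR3"),
  fatalities    = c(0L, 4L, 4L, 3L)
)

# duplicated() flags every row that repeats an earlier row
dup_rows <- duplicated(toy)
sum(dup_rows)

# Keep only the first occurrence of each row
toy_unique <- toy[!dup_rows, ]
```

To treat rows as duplicates when only the event ID repeats, the same call can be made on that column alone, for example duplicated(toy$event_id_cnty).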

3. Data Wrangling


The flowchart diagram below provides an overview of the key variables used in this project.

flowchart TD

Use dmy() from lubridate to convert event_date from character to Date format:

data$event_date <- dmy(data$event_date)
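
The raw event_date values look like "31-Dec-23", that is, day, abbreviated month name and two-digit year. The sketch below shows how dmy() handles that pattern on a few hand-written strings (raw_dates and parsed are illustrative names). It assumes an English locale for month abbreviations. Because the parser expands two-digit years, it is worth checking the range of the parsed dates afterwards.

```r
library(lubridate)

# Hand-written examples; the last one is deliberately invalid
raw_dates <- c("31-Dec-23", "01-Feb-21", "not a date")

# Strings that cannot be parsed become NA (with a warning, silenced here)
parsed <- suppressWarnings(dmy(raw_dates))

class(parsed)        # Date
sum(is.na(parsed))   # number of values that failed to parse
range(parsed, na.rm = TRUE)
```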

4. Initial Exploratory Data Analysis


4.1 Descriptive Statistics

Before proceeding with data visualisation, it helps to be able to navigate the dataset of 55,574 observations with ease. The interactive data table below, created with the DT package, lets users search and filter the observations instead of scrolling through them one by one.

Design Features - Interactive Data Table
  • Display a chosen number of observations using the dropdown (5, 10, 25, 50 or 100 entries). This ensures that the observations do not span the entire webpage.

  • View other pages of observations with the “Previous” or “Next” buttons.

  • Search for observations with the search bar, which matches any occurrence of a string or numerical value in any column.

  • Filter observations with the filter boxes directly below the column headers.

  • Use column visibility to select the columns of interest and hide the rest.

DT::datatable(
  data, 
  class = "compact",
  filter = "top", 
  extensions = c("Buttons"),
  options = list(
    pageLength = 5,
    columnDefs = list(
      list(targets = c(1:27, 29:31), className = "dt-center"), # text align center
      list(targets = c(28), visible = FALSE)
    ),
    buttons = list(
      list(extend = "colvis", columns = c(1:31))
      ),

    dom = "Bpiltf"
  ),
  caption = "Table 1:"
)

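Calculate new variables: for each year, the total fatalities and incidents, and the percentage shares of incidents and fatalities by selected event and sub-event types.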

# Event types counted as political violence in this analysis
political_types <- c("Battles", "Protests", "Explosions/Remote violence",
                     "Violence against civilians")

data2 <- data %>%
  filter(fatalities > 0) %>%   # keep only incidents with at least one fatality
  group_by(year) %>%
  mutate(
    total_fata = sum(fatalities),  # fatalities in the year
    total_inci = n(),              # fatal incidents in the year

    ## incidents: share of the year's incidents (%)
    # Political violence rates
    political_inci_rate = round(
      sum(event_type %in% political_types) / total_inci * 100),
    # Violence against civilian rates
    civilian_inci_rate = round(
      sum(event_type == "Violence against civilians") / total_inci * 100),
    # Exchange of territory
    non_state_inci_rate = round(
      sum(sub_event_type == "Non-state actor overtakes territory") / total_inci * 100),
    govt_regain_inci_rate = round(
      sum(sub_event_type == "Government regains territory") / total_inci * 100),

    ## fatalities: share of the year's fatalities (%)
    # Political violence rates
    political_fata_rate = round(
      sum(fatalities[event_type %in% political_types]) / total_fata * 100, 2),
    # Violence against civilian rates
    civilian_fata_rate = round(
      sum(fatalities[event_type == "Violence against civilians"]) / total_fata * 100, 2),
    # Exchange of territory
    non_state_fata_rate = round(
      sum(fatalities[sub_event_type == "Non-state actor overtakes territory"]) / total_fata * 100, 2),
    govt_regain_fata_rate = round(
      sum(fatalities[sub_event_type == "Government regains territory"]) / total_fata * 100, 2)
  ) %>%
  ungroup()

final <- data2 %>%
  select(-time_precision, -assoc_actor_1, -assoc_actor_2, -geo_precision, -source_scale, -timestamp, -tags, 
         -population_1km, -population_2km, -population_5km, -population_best)
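
As a sanity check on the rate definitions, the fatality share of an event type in a given year is the fatalities from that event type divided by all fatalities in that year. A small sketch with made-up numbers (toy, shares and civilian_share are illustrative names, not project variables):

```r
library(dplyr)

# Made-up incidents: 10 fatalities in 2022 and 4 in 2023
toy <- data.frame(
  year       = c(2022, 2022, 2022, 2023, 2023),
  event_type = c("Battles", "Violence against civilians", "Battles",
                 "Violence against civilians", "Battles"),
  fatalities = c(6, 2, 2, 3, 1)
)

# One row per year: total fatalities and the civilian-violence share (%)
shares <- toy %>%
  group_by(year) %>%
  summarise(
    total_fata     = sum(fatalities),
    civilian_share = round(
      sum(fatalities[event_type == "Violence against civilians"]) /
        total_fata * 100, 2),
    .groups = "drop"
  )

shares
```

With these numbers the share is 2 / 10 = 20% for 2022 and 3 / 4 = 75% for 2023, which is easy to confirm by hand.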

4.2 Distribution Analysis

To do: add a dropdown to filter by event_type/sub_event_type.

boxplot

https://managementsystemsintl.github.io/methods-corner/Exploring%20ACLED/ExploringACLED.html

ggplot(final, aes(x = forcats::fct_infreq(admin1), y = event_date, color = factor(admin1), fill = factor(admin1))) +
  geom_sina(method = "density", alpha = .3) +
  geom_boxplot(width = .2, color = "#000000", fill = NA, size = .5, outlier.shape = NA, position = position_nudge(.25)) +
  coord_flip()+
  theme(legend.position = "none", 
        plot.title.position = "plot") +
  ggtitle("Frequency of Conflict Has Increased Over Time in Most Administrative Regions"
          , subtitle = "add subtitle") +
  labs(y = "Year (2010-2023)"
       , x = ""
       , caption = "Data Source: ACLED (2023)")

4.3 Trend Analysis

Calendar Heatmap

Derive new fields (wkday, day, month and week) from event_date:

# Row-wise date parts; no grouping is needed for these
calendar <- final %>%
  filter(fatalities > 0) %>%
  mutate(
    wkday = weekdays(event_date),
    day = mday(event_date),
    month = factor(months(event_date), levels = rev(month.name)),
    week = isoweek(event_date)
  )

# Sum fatalities per calendar day across all regions so that each
# heatmap cell (year, month, day) corresponds to exactly one tile
cal_conflict <- calendar %>%
  group_by(year, month, day) %>%
  summarise(total_fata = sum(fatalities), .groups = "drop")

# tooltip
tooltip_heat <- paste("<b>", cal_conflict$day, " ", cal_conflict$month, " ", cal_conflict$year, "</b>", 
                      "\nFatalities : ", cal_conflict$total_fata)

heat <- ggplot(cal_conflict, aes(x = day, y = month, fill = total_fata)) + 
  # linewidth replaces the deprecated size argument for tile borders (ggplot2 >= 3.4.0)
  geom_tile(color = "white", linewidth = 1, aes(text = tooltip_heat)) + 
  theme_tufte(base_family = "Helvetica") + 
  coord_equal() +
  # the fill encodes total_fata, so the legend is labelled as fatalities
  scale_fill_gradient(name = "No. of Fatalities", low = "#fff2f4", high = "lightcoral") +
  facet_grid(year~.) +
  labs(x = "Days of Month", 
       y = "Month/Year", 
       title = "Armed Conflicts",
       subtitle = "INSERT SUBTITLE",
       caption = "Data Source: ACLED (2023)") +
  theme(axis.ticks = element_blank(),
        axis.text.x = element_text(size = 7),
        plot.title = element_text(hjust = 0.5),
        panel.border = element_rect(color = "grey", fill = NA, linewidth = 0.5),
        legend.title = element_text(size = 8),
        legend.text = element_text(size = 6),
        legend.position = "top")
heat

# Convert ggplot to plotly (to include custom tooltip)
heat_plotly <- ggplotly(heat, tooltip = "text")

# Add caption
heat_plotly <- heat_plotly %>% 
  layout(
  annotations = list(
    text = "Data Source: ACLED (2023)",
    x = 1.1,
    y = -0.2,
    showarrow = FALSE,
    xref = "paper",
    yref = "paper"
  )
)

heat_plotly

Trend line

year_fata <- final %>%
  filter(fatalities > 0) %>%
  group_by(year) %>%
  select(year, fatalities) %>%
  summarise(total_fata = sum(fatalities),
            total_inci = n()) %>%
  ungroup()

hc_plot1 <-  highchart() %>% 
  hc_add_series(year_fata, hcaes(x = as.factor(year), y = total_fata), type = "line", 
                name = "Total Fatalities", color = "lightcoral") %>%
  hc_add_series(year_fata, hcaes(x = as.factor(year), y = total_inci), type = "line", 
                name = "Total Incidents", color = "black") %>%
  # borderWidth was passed twice (1.5 and 5); only one value is kept
  hc_tooltip(crosshairs = TRUE, headerFormat = "", 
             backgroundColor = "#FCFFC5",
             borderWidth = 5,
             pointFormat = "Year: <b>{point.year}</b>
                                 <br> Fatalities: <b>{point.total_fata}</b>
                                 <br> Incidents: <b>{point.total_inci}</b>"
             ) %>%
  hc_title(text = "Armed Conflict Over The Years") %>% 
  hc_subtitle(text = "2010 to 2023") %>%
  hc_xAxis(title = list(text = "Year")) %>%
  hc_yAxis(title = list(text = "Frequency"),
           allowDecimals = FALSE,
           plotLines = list(list(
             color = "lightcoral", width = 1, dashStyle = "Dash",
             value = mean(year_fata$total_fata),
             label = list(text = paste("Average fatalities:", round(mean(year_fata$total_fata))),
             style = list(color = 'lightcoral', fontSize = 20))))) %>% 
  hc_add_theme(hc_theme_flat())
hc_plot1

Reference
